[ML] Don't block thread while waiting for work to finish on graceful shutdown #135350

davidkyle · 2025-09-24T12:57:31Z

A model deployment that is gracefully shut down waits until the queued work is done (or a timeout expires) before terminating the inference process. Waiting for that work to complete should not block a thread, as the thread may be blocked for up to 5 minutes. This change adds a callback to the worker queue that terminates the process once the queued work has completed.
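As a rough illustration of the idea described above, and not the code in this PR, the sketch below queues a no-op marker behind the pending work and attaches a callback with a timeout, so the caller returns immediately instead of parking a thread. `GracefulStopSketch`, `submit`, `stopGracefully` and `terminateProcess` are hypothetical names invented for the example; only JDK classes are used, and the single-threaded executor is an assumed stand-in for the deployment's worker queue.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: these names are illustrative, not the Elasticsearch classes.
final class GracefulStopSketch {
    // Single-threaded FIFO executor standing in for the deployment's worker queue.
    private final ExecutorService workerQueue = Executors.newSingleThreadExecutor();
    // Stand-in for whatever terminates the inference process.
    private final Runnable terminateProcess;

    GracefulStopSketch(Runnable terminateProcess) {
        this.terminateProcess = terminateProcess;
    }

    void submit(Runnable work) {
        workerQueue.execute(work);
    }

    // Returns immediately. The returned future completes with true if the queued
    // work drained before the timeout, and with false if the timeout fired first.
    CompletableFuture<Boolean> stopGracefully(long timeout, TimeUnit unit) {
        // The no-op marker is queued behind all previously submitted work.
        CompletableFuture<Void> drained = CompletableFuture.runAsync(() -> {}, workerQueue);
        // Reject new work; work that is already queued still runs.
        workerQueue.shutdown();
        return drained.orTimeout(timeout, unit).handle((ignored, error) -> {
            boolean finishedInTime = error == null;
            if (finishedInTime == false) {
                // Timed out: abandon the remaining queued work.
                workerQueue.shutdownNow();
            }
            // Runs once, on whichever thread completes the future.
            terminateProcess.run();
            return finishedInTime;
        });
    }
}
```

In this sketch the termination callback runs on the worker thread when the queue drains, or on the JDK's internal timeout thread when the limit is reached, so no caller thread waits for the full timeout.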

Follow on from #134673

# Conflicts:
#	x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/assignment/TrainedModelAssignmentNodeService.java
#	x-pack/plugin/ml/src/test/java/org/elasticsearch/xpack/ml/inference/assignment/TrainedModelAssignmentNodeServiceTests.java

...plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/deployment/ShutdownTracker.java

elasticsearchmachine · 2025-09-30T09:06:02Z

Pinging @elastic/ml-core (Team:ML)

...ugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/deployment/DeploymentManager.java

On graceful shutdown don't block thread while waiting for work to finish

8357159

davidkyle added >refactoring :ml Machine learning labels Sep 24, 2025

elasticsearchmachine added the v9.2.0 label Sep 24, 2025

davidkyle added 2 commits September 29, 2025 17:26

fix tests after conflict

ca896c5

jonathan-buttner reviewed Sep 29, 2025

...plugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/deployment/ShutdownTracker.java (outdated, resolved)

Use a timeout listener

2ac5031

davidkyle marked this pull request as ready for review September 30, 2025 09:05

elasticsearchmachine added the Team:ML Meta label for the ML team label Sep 30, 2025

davidkyle enabled auto-merge (squash) September 30, 2025 09:18

Merge branch 'main' into graceful-stop-listener

072e0b6

jonathan-buttner reviewed Sep 30, 2025

...ugin/ml/src/main/java/org/elasticsearch/xpack/ml/inference/deployment/DeploymentManager.java (outdated, resolved)

fix delegate

e226379

jonathan-buttner approved these changes Sep 30, 2025

davidkyle added the cloud-deploy Publish cloud docker image for Cloud-First-Testing label Sep 30, 2025

davidkyle added 3 commits September 30, 2025 16:29

Merge branch 'main' into graceful-stop-listener

01e8acc

Merge branch 'main' into graceful-stop-listener

8caf55b

Merge branch 'main' into graceful-stop-listener

2d23388

davidkyle merged commit 6d2c3ef into elastic:main Oct 1, 2025
35 checks passed
